Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation
Internal identifier: 000641 (Main/Exploration); previous: 000640; next: 000642
Authors: Marco Turchi [Italy]; Josef Steinberger [Italy]; Mijail Kabadjov [Italy]; Ralf Steinberger [Italy]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2010.
Abstract
We present a method for evaluating multilingual multi-document summarisation that saves precious annotation time and makes evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus and projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply both to a baseline sentence-extraction summariser in seven languages, and discuss the differences in results between the two evaluation versions, as well as a preliminary comparison between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
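The projection step described in the abstract can be illustrated with a minimal sketch. It assumes that the parallel corpus is stored as one list of sentences per language, with equal indices denoting aligned sentences. The function names `project_selection` and `selection_precision`, the toy corpus, and the precision measure are all hypothetical illustrations. They are not taken from the paper, which does not specify its scoring in this record.

```python
from typing import Dict, List, Set


def project_selection(selected: Set[int],
                      corpus: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Project sentence indices chosen in one language onto every language.

    Assumption: corpus[lang][i] is the translation of corpus[other][i],
    i.e. the corpus is sentence-aligned one-to-one.
    """
    order = sorted(selected)
    return {lang: [sentences[i] for i in order]
            for lang, sentences in corpus.items()}


def selection_precision(system: Set[int], gold: Set[int]) -> float:
    """Share of system-selected sentences that the annotator also selected.

    Illustrative measure only; not the evaluation used by the authors.
    """
    if not system:
        return 0.0
    return len(system & gold) / len(system)


# Toy sentence-aligned corpus (hypothetical data).
corpus = {
    "en": ["Sentence one.", "Sentence two.", "Sentence three."],
    "it": ["Frase uno.", "Frase due.", "Frase tre."],
}

# An annotator marks sentences 0 and 2 once; the choice carries over
# to every language without further annotation.
gold = {0, 2}
projected = project_selection(gold, corpus)

# A summariser run on any language can be scored against the same indices.
score = selection_precision({0, 1}, gold)
```

Because the gold selection is a set of indices rather than language-specific text, the same annotation serves every language in the corpus, which is what makes the scores comparable across languages in this sketch.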
DOI: 10.1007/978-3-642-15998-5_7
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 002369
- to stream Istex, to step Curation: 002206
- to stream Istex, to step Checkpoint: 000221
- to stream Main, to step Merge: 000646
- to stream Main, to step Curation: 000641
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15998-5_7</idno>
<idno type="url">https://api.istex.fr/document/0E86CDCBC38DF61735F16B424BFE01559A3650F6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002369</idno>
<idno type="wicri:Area/Istex/Curation">002206</idno>
<idno type="wicri:Area/Istex/Checkpoint">000221</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Turchi M:using:parallel:corpora</idno>
<idno type="wicri:Area/Main/Merge">000646</idno>
<idno type="wicri:Area/Main/Curation">000641</idno>
<idno type="wicri:Area/Main/Exploration">000641</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<idno type="DOI">10.1007/978-3-642-15998-5_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We are presenting a method for the evaluation of multilingual multi-document summarisation that allows saving precious annotation time and that makes the evaluation results across languages directly comparable. The approach is based on the manual selection of the most important sentences in a cluster of documents from a sentence-aligned parallel corpus, and by projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the result differences between the two evaluation versions, as well as a preliminary analysis between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.</div>
</front>
</TEI>
<affiliations><list><country><li>Italie</li>
</country>
</list>
<tree><country name="Italie"><noRegion><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
</noRegion>
<name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000641 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000641 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6 |texte= Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation }}
This area was generated with Dilib version V0.6.32.